data-driven study
A Data-Driven Study of Commonsense Knowledge using the ConceptNet Knowledge Base
Acquiring commonsense knowledge and reasoning is recognized as an important frontier in achieving general Artificial Intelligence (AI). Recent research in the Natural Language Processing (NLP) community has demonstrated significant progress in this problem setting. Despite this progress, which has come mainly on multiple-choice question answering tasks in limited settings, there is still a lack of understanding (especially at scale) of the nature of commonsense knowledge itself. In this paper, we propose and conduct a systematic study to enable a deeper understanding of commonsense knowledge through an empirical and structural analysis of the ConceptNet knowledge base. ConceptNet is a freely available knowledge base containing millions of commonsense assertions presented in natural language.
How Do Crowdworker Communities and Microtask Markets Influence Each Other? A Data-Driven Study on Amazon Mechanical Turk
Yang, Jie (University of Fribourg) | Valk, Carlo van der (Delft University of Technology) | Hoßfeld, Tobias (University of Würzburg) | Redi, Judith (Delft University of Technology, Exact B.V.) | Bozzon, Alessandro (Delft University of Technology)
Crowdworker online communities, operating in fora like mTurkForum and TurkerNation, are important actors in microwork markets. Although these communities are central to market dynamics, how their behavior and the dynamics of online marketplaces influence each other is yet to be understood. To provide quantitative evidence of such influence, we performed an analysis of six years' worth of mTurk market activities and community discussions in six fora. We investigated the nature of the relationships that exist between activities in fora, tasks published in mTurk, requesters for such tasks, and task completion speed. We validate, and expand upon, results from previous work by showing that (i) there are differences between market demand and community activities that are specific to fora and task types; (ii) the temporal progression of HIT availability in the market is predictive of the upcoming amount of crowdworker discussions, with significant differences across fora and discussion categories; (iii) activities in fora can have a significant positive impact on the completion speed of tasks available in the market.